AITopics | universal approximation

State-space Models with Layer-wise Nonlinearity are Universal Approximators with Exponential Decaying Memory

Neural Information Processing Systems, Feb-17-2026, 18:40:18 GMT

State-space models have gained popularity in sequence modelling due to their simple and efficient network structures. However, the absence of nonlinear activation along the temporal direction limits the model's capacity.
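To make the distinction in this abstract concrete, here is a minimal sketch of a stacked state-space model in which the recurrence is linear in time and the only nonlinearity is applied per layer. It is not the paper's construction: the diagonal state, the `tanh` readout, and the names `ssm_layer`, `stacked_ssm`, `lam`, `b`, `c` are assumptions chosen for illustration.

```python
import math

def ssm_layer(xs, lam, b, c):
    """One diagonal state-space layer (illustrative parameters, not from the paper).

    The state update h_t = lam * h_{t-1} + b * x_t is linear in time.
    The only nonlinearity, tanh, is applied to the layer's readout.
    """
    h = [0.0] * len(lam)
    ys = []
    for x in xs:
        # Linear recurrence: no activation along the temporal direction.
        h = [l * hi + bi * x for l, hi, bi in zip(lam, h, b)]
        # Layer-wise nonlinearity on the readout c . h_t.
        ys.append(math.tanh(sum(ci * hi for ci, hi in zip(c, h))))
    return ys

def stacked_ssm(xs, layers):
    """Feed the output sequence of each layer into the next one."""
    for lam, b, c in layers:
        xs = ssm_layer(xs, lam, b, c)
    return xs

# Two layers with |lam| < 1, so each linear recurrence is stable.
layers = [([0.5, 0.9], [1.0, 1.0], [0.3, -0.2]), ([0.7], [1.0], [1.0])]
impulse = [1.0] + [0.0] * 199
response = stacked_ssm(impulse, layers)
```

With every `lam` strictly inside the unit interval, the response to a single impulse shrinks geometrically over time, which is one simple way to picture the decaying memory named in the title.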

artificial intelligence, machine learning, state-space model, (19 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

39555391eb0624a439c5131b1bb8a2e0-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-11-2026, 22:23:32 GMT

dependence, hanin and sellke, miller and hardt, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.32)

69ec5030f78a9b735402d133317bf5f6-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 04:43:22 GMT

approximation, polynomial, underdamped langevin dynamic, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

2000f6325dfc4fc3201fc45ed01c7a5d-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-7-2026, 18:24:00 GMT

dimension, referee, transport map, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.51)

Universal Approximation of Input-Output Maps by Temporal Convolutional Nets

Neural Information Processing Systems, Dec-25-2025, 06:11:17 GMT

There has been a recent shift in sequence-to-sequence modeling from recurrent network architectures to convolutional network architectures due to computational advantages in training and operation while still achieving competitive performance. For systems having limited long-term temporal dependencies, the approximation capability of recurrent networks is essentially equivalent to that of temporal convolutional nets (TCNs). We prove that TCNs can approximate a large class of input-output maps having approximately finite memory to arbitrary error tolerance. Furthermore, we derive quantitative approximation rates for deep ReLU TCNs in terms of the width and depth of the network and modulus of continuity of the original input-output map, and apply these results to input-output maps of systems that admit finite-dimensional state-space realizations (i.e., recurrent models).
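The finite-memory property this abstract relies on can be illustrated with a small sketch of a causal, dilated convolution stack with a ReLU in between. The weights, the two-layer depth, and the names `causal_conv`, `relu`, and `tiny_tcn` are hypothetical and are not taken from the paper.

```python
def causal_conv(xs, weights, dilation=1):
    """y[t] = sum_k weights[k] * x[t - k * dilation], treating x[t] as 0 for t < 0."""
    ys = []
    for t in range(len(xs)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # only current and past inputs are read (causality)
                acc += w * xs[idx]
        ys.append(acc)
    return ys

def relu(values):
    return [max(0.0, v) for v in values]

def tiny_tcn(xs):
    """Two causal layers; receptive field = 1 + (2 - 1) * 1 + (2 - 1) * 2 = 4 steps."""
    hidden = relu(causal_conv(xs, [0.5, 0.5], dilation=1))
    return causal_conv(hidden, [1.0, -1.0], dilation=2)

out = tiny_tcn([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

Because the receptive field here is four steps, the last output is unchanged if any input more than three steps in the past is altered. Such a network therefore has exactly finite memory, and it is suited to input-output maps whose dependence on older inputs is negligible.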

input-output map, temporal convolutional net, universal approximation, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.42)

Universal Approximation Using Well-Conditioned Normalizing Flows

Neural Information Processing Systems, Dec-24-2025, 06:08:15 GMT

Normalizing flows are a widely used class of latent-variable generative models with a tractable likelihood. Affine-coupling models [Dinh et al., 2014, 2016] are a particularly common type of normalizing flows, for which the Jacobian of the latent-to-observable-variable transformation is triangular, allowing the likelihood to be computed in linear time. Despite the widespread usage of affine couplings, the special structure of the architecture makes understanding their representational power challenging. The question of universal approximation was only recently resolved by three parallel papers [Huang et al., 2020, Zhang et al., 2020, Koehler et al., 2020] - who showed reasonably regular distributions can be approximated arbitrarily well using affine couplings - albeit with networks with a nearly-singular Jacobian. As ill-conditioned Jacobians are an obstacle for likelihood-based training, the fundamental question remains: which distributions can be approximated using well-conditioned affine coupling flows? In this paper, we show that any log-concave distribution can be approximated using well-conditioned affine-coupling flows. In terms of proof techniques, we uncover and leverage deep connections between affine coupling architectures, underdamped Langevin dynamics (a stochastic differential equation often used to sample from Gibbs measures) and Hénon maps (a structured dynamical system that appears in the study of symplectic diffeomorphisms). In terms of informing practice, we approximate a padded version of the input distribution with iid Gaussians - a strategy which Koehler et al. [2020] empirically observed to result in better-conditioned flows, but had hitherto no theoretical grounding. Our proof can thus be seen as providing theoretical evidence for the benefits of Gaussian padding when training normalizing flows.
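As a rough illustration of why affine couplings have a triangular Jacobian and a cheap likelihood, here is a sketch of a single coupling layer on a two-dimensional input. The scale and shift functions `s` and `t` are arbitrary stand-ins for the neural networks used in practice, and nothing here reproduces the paper's construction.

```python
import math

def s(x1):
    """Hypothetical log-scale function (a neural network in practice)."""
    return 0.5 * math.tanh(x1)

def t(x1):
    """Hypothetical shift function (a neural network in practice)."""
    return x1 ** 2

def forward(x1, x2):
    """Affine coupling: x1 passes through unchanged, x2 is scaled and shifted.

    The Jacobian of (x1, x2) -> (y1, y2) is lower triangular with diagonal
    (1, exp(s(x1))), so log|det J| is simply s(x1).
    """
    log_scale = s(x1)
    y2 = x2 * math.exp(log_scale) + t(x1)
    return x1, y2, log_scale

def inverse(y1, y2):
    """Exact inverse: s and t only ever need to be evaluated, never inverted."""
    x2 = (y2 - t(y1)) * math.exp(-s(y1))
    return y1, x2

y1, y2, log_det = forward(0.3, -1.2)
x1_rec, x2_rec = inverse(y1, y2)
```

In higher dimensions the log-determinant is the sum of the per-coordinate log-scales, which is where the linear-time likelihood comes from. A log-scale of large magnitude makes the Jacobian's diagonal entry very large or very small, one way such a layer can become ill-conditioned.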

name change, universal approximation, well-conditioned normalizing flow, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.39)

Incremental Generation is Necessary and Sufficient for Universality in Flow-Based Modelling

Rouhvarzi, Hossein, Kratsios, Anastasis

arXiv.org Machine Learning, Nov-14-2025

Incremental flow-based denoising models have reshaped generative modelling, but their empirical advantage still lacks a rigorous approximation-theoretic foundation. We show that incremental generation is necessary and sufficient for universal flow-based generation on the largest natural class of self-maps of $[0,1]^d$ compatible with denoising pipelines, namely the orientation-preserving homeomorphisms of $[0,1]^d$. All our guarantees are uniform on the underlying maps and hence imply approximation both samplewise and in distribution. Using a new topological-dynamical argument, we first prove an impossibility theorem: the class of all single-step autonomous flows, independently of the architecture, width, depth, or Lipschitz activation of the underlying neural network, is meagre and therefore not universal in the space of orientation-preserving homeomorphisms of $[0,1]^d$. By exploiting algebraic properties of autonomous flows, we conversely show that every orientation-preserving Lipschitz homeomorphism on $[0,1]^d$ can be approximated at rate $\mathcal{O}(n^{-1/d})$ by a composition of at most $K_d$ such flows, where $K_d$ depends only on the dimension. Under additional smoothness assumptions, the approximation rate can be made dimension-free, and $K_d$ can be chosen uniformly over the class being approximated. Finally, by linearly lifting the domain into one higher dimension, we obtain structured universal approximation results for continuous functions and for probability measures on $[0,1]^d$, the latter realized as pushforwards of empirical measures with vanishing $1$-Wasserstein error.

artificial intelligence, homeomorphism, machine learning, (17 more...)

arXiv.org Machine Learning

2511.09902

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.82)

Industry: Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

ea8608c6258450e75b3443ec8022fb2e-Paper-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 10:46:55 GMT

artificial intelligence, machine learning, state-space model, (19 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

On the equivalence between graph isomorphism testing and function approximation with GNNs

Zhengdao Chen, Soledad Villar, Lei Chen, Joan Bruna

Neural Information Processing Systems, Oct-2-2025, 23:53:11 GMT

There has been increasing interest in studying the representation power of graph neural networks (GNNs).

approximation, graph, neural network, (13 more...)

Neural Information Processing Systems

Country: North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)

Supplementary material

Neural Information Processing Systems, Oct-2-2025, 20:42:12 GMT

Appendix B proves universal approximation of the Neural CDE model, and is substantially more technical than the rest of this paper. Appendix C proves that the Neural CDE model subsumes alternative ODE models which depend directly and nonlinearly on the data. Appendix D gives the full details of every experiment, such as choice of optimiser, hyperparameter searches, and so on. To evaluate the model as discussed in Section 3.2, X must be at least continuous and piecewise differentiable. A.1 Differentiating with respect to the time points: However, there is a technical caveat in the specific case that derivatives with respect to the initial time t [...] A.2 Adaptive step size solvers: There is one further caveat that must be considered.

artificial intelligence, machine learning, neural cde model, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
